Skip to content

Conversation

@Prucek
Copy link
Member

@Prucek Prucek commented Oct 30, 2025

This PR adds more metrics to see how much time tide spends retesting.

🤖 Assisted by Claude.

@netlify
Copy link

netlify bot commented Oct 30, 2025

Deploy Preview for k8s-prow ready!

Name Link
🔨 Latest commit 8f0cb89
🔍 Latest deploy log https://app.netlify.com/projects/k8s-prow/deploys/69048cd83e6f770008e24316
😎 Deploy Preview https://deploy-preview-539--k8s-prow.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/tide Issues or PRs related to prow's tide component labels Oct 30, 2025
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 30, 2025
@Prucek
Copy link
Member Author

Prucek commented Oct 30, 2025

/test pull-prow-unit-test-race-detector-nonblocking

pkg/tide/tide.go Outdated
Comment on lines 300 to 308
retestsByAction: prometheus.NewCounterVec(prometheus.CounterOpts{
Name: "tide_retests_by_action_total",
Help: "Total number of retests by action type (TRIGGER for serial, TRIGGER_BATCH for batch). Helps identify whether batch or serial testing is causing more retests.",
}, []string{
"org",
"repo",
"branch",
"action",
}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we need both tide_retests_by_action_total and tide_retests_total, if you need the latter you can get it from the former by aggregating. We should only have tide_retests_total but let it have action label.

pkg/tide/tide.go Outdated
Comment on lines 1792 to 1793
if len(batchPending) == 0 && len(missings) > 0 && len(pendings) == 0 {
if act == Trigger {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand why this indicates a batch failure? I think we can be in this state simply because the HEAD moved and now a PR needs to be (maybe individually, not necessarily in a batch) retested?

pkg/tide/tide.go Outdated
Comment on lines 309 to 316
batchFailures: prometheus.NewCounterVec(prometheus.CounterOpts{
Name: "tide_batch_failures_total",
Help: "Total number of times a batch test completes but PRs move back to missing state (indicating batch failure). This is expensive as all PRs in the batch need retesting.",
}, []string{
"org",
"repo",
"branch",
}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really need to be a Tide metric? Batch jobs are a separate type of a ProwJob, I guess we already have that number from crier (or plank maybe) in a metric that tracks job results?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right: pkg/kube/metrics.go

@stmcginnis
Copy link
Contributor

Looks like all comments have been address, and the changes look good to me.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2025
@smg247
Copy link
Contributor

smg247 commented Nov 17, 2025

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Prucek, smg247

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 17, 2025
@matthyx
Copy link
Contributor

matthyx commented Nov 17, 2025

/retest

@k8s-ci-robot k8s-ci-robot merged commit cfbb719 into kubernetes-sigs:main Nov 17, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/tide Issues or PRs related to prow's tide component cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants